Overview

Dataset statistics

Number of variables: 40
Number of observations: 32950
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 11
Duplicate rows (%): < 0.1%
Total size in memory: 5.5 MiB
Average record size in memory: 174.0 B

Variable types

BOOL: 28
NUM: 12

Warnings

Dataset has 11 (< 0.1%) duplicate rows [Duplicates]
poutcome is highly correlated with pdays [High correlation]
pdays is highly correlated with poutcome [High correlation]
euribor3m is highly correlated with emp.var.rate and 1 other field [High correlation]
emp.var.rate is highly correlated with euribor3m and 1 other field [High correlation]
nr.employed is highly correlated with emp.var.rate and 1 other field [High correlation]
contact_telephone is highly correlated with contact_cellular [High correlation]
contact_cellular is highly correlated with contact_telephone [High correlation]
previous has 28394 (86.2%) zeros [Zeros]
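
The percentages quoted in the warnings follow from the raw counts and the 32950 observations. The sketch below reproduces them in plain Python. The helper name `share_label` and its rounding rule (one decimal place, with "< 0.1%" and "> 99.9%" for the extremes) are assumptions chosen to match the labels printed in this report; they are not taken from the pandas-profiling source.

```python
def share_label(count: int, total: int) -> str:
    """Format count/total as a percentage label (assumed rounding rule)."""
    if total <= 0:
        raise ValueError("total must be positive")
    pct = 100.0 * count / total
    if 0 < pct < 0.1:
        return "< 0.1%"   # tiny but non-zero shares
    if 99.9 < pct < 100:
        return "> 99.9%"  # almost, but not quite, everything
    return f"{pct:.1f}%"


N_OBSERVATIONS = 32950  # from the dataset statistics

print(share_label(28394, N_OBSERVATIONS))  # zeros in `previous` -> 86.2%
print(share_label(11, N_OBSERVATIONS))     # duplicate rows -> < 0.1%
```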

Reproduction

Analysis started: 2021-01-21 11:25:10.822675
Analysis finished: 2021-01-21 11:25:55.111505
Duration: 44.29 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

age
Real number (ℝ≥0)

Distinct: 77
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.04021244
Minimum: 17
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 17
5-th percentile: 26
Q1: 32
Median: 38
Q3: 47
95-th percentile: 58
Maximum: 98
Range: 81
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 10.43231316
Coefficient of variation (CV): 0.2605458993
Kurtosis: 0.7718456354
Mean: 40.04021244
Median Absolute Deviation (MAD): 7
Skewness: 0.777049751
Sum: 1319325
Variance: 108.8331579
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
3115474.7%
 
3214634.4%
 
3314304.3%
 
3514264.3%
 
3614094.3%
 
3413924.2%
 
3013814.2%
 
2911733.6%
 
3711733.6%
 
3911593.5%
 
3811133.4%
 
4110373.1%
 
409292.8%
 
429092.8%
 
458832.7%
 
438282.5%
 
448172.5%
 
288102.5%
 
468102.5%
 
487702.3%
 
477432.3%
 
507012.1%
 
496962.1%
 
276832.1%
 
526401.9%
 
Other values (52)702821.3%
 
ValueCountFrequency (%) 
174< 0.1%
 
18230.1%
 
19330.1%
 
20530.2%
 
21890.3%
 
221090.3%
 
231850.6%
 
243701.1%
 
254771.4%
 
265481.7%
 
ValueCountFrequency (%) 
982< 0.1%
 
951< 0.1%
 
941< 0.1%
 
923< 0.1%
 
912< 0.1%
 
892< 0.1%
 
8816< 0.1%
 
867< 0.1%
 
8513< 0.1%
 
844< 0.1%
 

marital
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 514.8 KiB

Value  Count  Frequency (%)
1  19966  60.6%
0  12984  39.4%

default
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 514.8 KiB

Value  Count  Frequency (%)
0  32947  > 99.9%
1  3  < 0.1%

housing
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 514.8 KiB

Value  Count  Frequency (%)
1  17232  52.3%
0  15718  47.7%

loan
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 514.8 KiB

Value  Count  Frequency (%)
0  27948  84.8%
1  5002  15.2%

month
Real number (ℝ≥0)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.605280728
Minimum: 3
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 3
5-th percentile: 4
Q1: 5
Median: 6
Q3: 8
95-th percentile: 11
Maximum: 12
Range: 9
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.041098571
Coefficient of variation (CV): 0.309010117
Kurtosis: -0.02206753928
Mean: 6.605280728
Median Absolute Deviation (MAD): 1
Skewness: 0.859118905
Sum: 217644
Variance: 4.166083376
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
51107333.6%
 
7575317.5%
 
8488214.8%
 
6425112.9%
 
11329610.0%
 
421086.4%
 
105691.7%
 
94531.4%
 
34211.3%
 
121440.4%
 
ValueCountFrequency (%) 
34211.3%
 
421086.4%
 
51107333.6%
 
6425112.9%
 
7575317.5%
 
8488214.8%
 
94531.4%
 
105691.7%
 
11329610.0%
 
121440.4%
 
ValueCountFrequency (%) 
121440.4%
 
11329610.0%
 
105691.7%
 
94531.4%
 
8488214.8%
 
7575317.5%
 
6425112.9%
 
51107333.6%
 
421086.4%
 
34211.3%
 

day_of_week
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.980789074
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.41158046
Coefficient of variation (CV): 0.4735593242
Kurtosis: -1.301078429
Mean: 2.980789074
Median Absolute Deviation (MAD): 1
Skewness: -0.001154769638
Sum: 98217
Variance: 1.992559394
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
ValueCountFrequency (%) 
4693721.1%
 
1680220.6%
 
2648419.7%
 
3646819.6%
 
5625919.0%
 
ValueCountFrequency (%) 
1680220.6%
 
2648419.7%
 
3646819.6%
 
4693721.1%
 
5625919.0%
 
ValueCountFrequency (%) 
5625919.0%
 
4693721.1%
 
3646819.6%
 
2648419.7%
 
1680220.6%
 

duration
Real number (ℝ≥0)

Distinct: 1463
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 257.3352049
Minimum: 0
Maximum: 4918
Zeros: 4
Zeros (%): < 0.1%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 36
Q1: 102
Median: 179
Q3: 318
95-th percentile: 746
Maximum: 4918
Range: 4918
Interquartile range (IQR): 216

Descriptive statistics

Standard deviation: 257.3316998
Coefficient of variation (CV): 0.9999863793
Kurtosis: 20.16816732
Mean: 257.3352049
Median Absolute Deviation (MAD): 93
Skewness: 3.24507812
Sum: 8479195
Variance: 66219.60371
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1361440.4%
 
851390.4%
 
921380.4%
 
721370.4%
 
1241340.4%
 
1351320.4%
 
1261320.4%
 
971320.4%
 
1041300.4%
 
821290.4%
 
931290.4%
 
901290.4%
 
891290.4%
 
1141280.4%
 
731280.4%
 
871280.4%
 
1061270.4%
 
881260.4%
 
1071260.4%
 
1221250.4%
 
1121250.4%
 
1091230.4%
 
1271220.4%
 
831220.4%
 
1111220.4%
 
Other values (1438)2971490.2%
 
ValueCountFrequency (%) 
04< 0.1%
 
13< 0.1%
 
21< 0.1%
 
33< 0.1%
 
410< 0.1%
 
5240.1%
 
6310.1%
 
7380.1%
 
8540.2%
 
9670.2%
 
ValueCountFrequency (%) 
49181< 0.1%
 
37851< 0.1%
 
36431< 0.1%
 
35091< 0.1%
 
34221< 0.1%
 
33661< 0.1%
 
33221< 0.1%
 
32841< 0.1%
 
32531< 0.1%
 
31831< 0.1%
 

campaign
Real number (ℝ≥0)

Distinct: 39
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.561729894
Minimum: 1
Maximum: 56
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 7
Maximum: 56
Range: 55
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.763646396
Coefficient of variation (CV): 1.078820371
Kurtosis: 37.7007375
Mean: 2.561729894
Median Absolute Deviation (MAD): 1
Skewness: 4.791548913
Sum: 84409
Variance: 7.6377414
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
ValueCountFrequency (%) 
11414842.9%
 
2844725.6%
 
3427713.0%
 
420976.4%
 
512953.9%
 
67762.4%
 
75051.5%
 
83201.0%
 
92170.7%
 
101780.5%
 
111370.4%
 
121030.3%
 
13690.2%
 
14570.2%
 
17440.1%
 
15430.1%
 
16410.1%
 
20260.1%
 
18260.1%
 
19180.1%
 
21170.1%
 
2215< 0.1%
 
2315< 0.1%
 
2413< 0.1%
 
2910< 0.1%
 
Other values (14)560.2%
 
ValueCountFrequency (%) 
11414842.9%
 
2844725.6%
 
3427713.0%
 
420976.4%
 
512953.9%
 
67762.4%
 
75051.5%
 
83201.0%
 
92170.7%
 
101780.5%
 
ValueCountFrequency (%) 
561< 0.1%
 
432< 0.1%
 
422< 0.1%
 
402< 0.1%
 
355< 0.1%
 
343< 0.1%
 
333< 0.1%
 
322< 0.1%
 
316< 0.1%
 
303< 0.1%
 

pdays
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 962.17478
Minimum: 0
Maximum: 999
Zeros: 12
Zeros (%): < 0.1%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 999
Q1: 999
Median: 999
Q3: 999
95-th percentile: 999
Maximum: 999
Range: 999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 187.6467855
Coefficient of variation (CV): 0.1950235959
Kurtosis: 22.00769747
Mean: 962.17478
Median Absolute Deviation (MAD): 0
Skewness: -4.899583916
Sum: 31703659
Variance: 35211.31611
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%) 
9993172896.3%
 
33501.1%
 
63261.0%
 
4930.3%
 
2550.2%
 
9540.2%
 
12480.1%
 
7470.1%
 
5390.1%
 
10390.1%
 
13300.1%
 
11240.1%
 
1220.1%
 
15220.1%
 
1416< 0.1%
 
813< 0.1%
 
012< 0.1%
 
168< 0.1%
 
177< 0.1%
 
187< 0.1%
 
193< 0.1%
 
212< 0.1%
 
222< 0.1%
 
261< 0.1%
 
201< 0.1%
 
ValueCountFrequency (%) 
012< 0.1%
 
1220.1%
 
2550.2%
 
33501.1%
 
4930.3%
 
5390.1%
 
63261.0%
 
7470.1%
 
813< 0.1%
 
9540.2%
 
ValueCountFrequency (%) 
9993172896.3%
 
271< 0.1%
 
261< 0.1%
 
222< 0.1%
 
212< 0.1%
 
201< 0.1%
 
193< 0.1%
 
187< 0.1%
 
177< 0.1%
 
168< 0.1%
 

previous
Real number (ℝ≥0)

ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1747799697
Minimum: 0
Maximum: 7
Zeros: 28394
Zeros (%): 86.2%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4965033629
Coefficient of variation (CV): 2.840733775
Kurtosis: 19.94757787
Mean: 0.1747799697
Median Absolute Deviation (MAD): 0
Skewness: 3.808522604
Sum: 5759
Variance: 0.2465155894
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
ValueCountFrequency (%) 
02839486.2%
 
1370311.2%
 
26031.8%
 
31750.5%
 
4560.2%
 
514< 0.1%
 
64< 0.1%
 
71< 0.1%
 
ValueCountFrequency (%) 
02839486.2%
 
1370311.2%
 
26031.8%
 
31750.5%
 
4560.2%
 
514< 0.1%
 
64< 0.1%
 
71< 0.1%
 
ValueCountFrequency (%) 
71< 0.1%
 
64< 0.1%
 
514< 0.1%
 
4560.2%
 
31750.5%
 
26031.8%
 
1370311.2%
 
02839486.2%
 

poutcome
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 514.8 KiB

Value  Count  Frequency (%)
0  31845  96.6%
1  1105  3.4%

emp.var.rate
Real number (ℝ)

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0762276176
Minimum: -3.4
Maximum: 1.4
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: -3.4
5-th percentile: -2.9
Q1: -1.8
Median: 1.1
Q3: 1.4
95-th percentile: 1.4
Maximum: 1.4
Range: 4.8
Interquartile range (IQR): 3.2

Descriptive statistics

Standard deviation: 1.572241965
Coefficient of variation (CV): 20.62562119
Kurtosis: -1.073664391
Mean: 0.0762276176
Median Absolute Deviation (MAD): 0.3
Skewness: -0.7169400989
Sum: 2511.7
Variance: 2.471944796
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1.41292739.2%
 
-1.8739222.4%
 
1.1621018.8%
 
-0.129609.0%
 
-2.913484.1%
 
-3.48542.6%
 
-1.76111.9%
 
-1.15041.5%
 
-31360.4%
 
-0.28< 0.1%
 
ValueCountFrequency (%) 
-3.48542.6%
 
-31360.4%
 
-2.913484.1%
 
-1.8739222.4%
 
-1.76111.9%
 
-1.15041.5%
 
-0.28< 0.1%
 
-0.129609.0%
 
1.1621018.8%
 
1.41292739.2%
 
ValueCountFrequency (%) 
1.41292739.2%
 
1.1621018.8%
 
-0.129609.0%
 
-0.28< 0.1%
 
-1.15041.5%
 
-1.76111.9%
 
-1.8739222.4%
 
-2.913484.1%
 
-31360.4%
 
-3.48542.6%
 

cons.price.idx
Real number (ℝ≥0)

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 93.57424343
Minimum: 92.201
Maximum: 94.767
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 92.201
5-th percentile: 92.713
Q1: 93.075
Median: 93.749
Q3: 93.994
95-th percentile: 94.465
Maximum: 94.767
Range: 2.566
Interquartile range (IQR): 0.919

Descriptive statistics

Standard deviation: 0.5786358031
Coefficient of variation (CV): 0.006183708057
Kurtosis: -0.8359512246
Mean: 93.57424343
Median Absolute Deviation (MAD): 0.38
Skewness: -0.2267829565
Sum: 3083271.321
Variance: 0.3348193926
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%) 
93.994621018.8%
 
93.918536416.3%
 
92.893468714.2%
 
93.444408912.4%
 
94.465347410.5%
 
93.229078.8%
 
93.07519666.0%
 
92.2016131.9%
 
92.9635951.8%
 
92.4313541.1%
 
92.6492860.9%
 
94.2152490.8%
 
94.1992390.7%
 
92.3792140.6%
 
92.8432120.6%
 
93.3692090.6%
 
94.0551820.6%
 
94.0271800.5%
 
93.8761760.5%
 
94.6011620.5%
 
93.7491420.4%
 
92.4691400.4%
 
92.7131360.4%
 
94.7671030.3%
 
93.798530.2%
 
ValueCountFrequency (%) 
92.2016131.9%
 
92.3792140.6%
 
92.4313541.1%
 
92.4691400.4%
 
92.6492860.9%
 
92.7131360.4%
 
92.7568< 0.1%
 
92.8432120.6%
 
92.893468714.2%
 
92.9635951.8%
 
ValueCountFrequency (%) 
94.7671030.3%
 
94.6011620.5%
 
94.465347410.5%
 
94.2152490.8%
 
94.1992390.7%
 
94.0551820.6%
 
94.0271800.5%
 
93.994621018.8%
 
93.918536416.3%
 
93.8761760.5%
 

cons.conf.idx
Real number (ℝ)

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -40.51867982
Minimum: -50.8
Maximum: -26.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: -50.8
5-th percentile: -47.1
Q1: -42.7
Median: -41.8
Q3: -36.4
95-th percentile: -33.6
Maximum: -26.9
Range: 23.9
Interquartile range (IQR): 6.3

Descriptive statistics

Standard deviation: 4.623004314
Coefficient of variation (CV): -0.1140956303
Kurtosis: -0.3569941733
Mean: -40.51867982
Median Absolute Deviation (MAD): 4.4
Skewness: 0.310353195
Sum: -1335090.5
Variance: 21.37216888
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%) 
-36.4621018.8%
 
-42.7536416.3%
 
-46.2468714.2%
 
-36.1408912.4%
 
-41.8347410.5%
 
-4229078.8%
 
-47.119666.0%
 
-31.46131.9%
 
-40.85951.8%
 
-26.93541.1%
 
-30.12860.9%
 
-40.32490.8%
 
-37.52390.7%
 
-29.82140.6%
 
-502120.6%
 
-34.82090.6%
 
-39.81820.6%
 
-38.31800.5%
 
-401760.5%
 
-49.51620.5%
 
-34.61420.4%
 
-33.61400.4%
 
-331360.4%
 
-50.81030.3%
 
-40.4530.2%
 
ValueCountFrequency (%) 
-50.81030.3%
 
-502120.6%
 
-49.51620.5%
 
-47.119666.0%
 
-46.2468714.2%
 
-45.98< 0.1%
 
-42.7536416.3%
 
-4229078.8%
 
-41.8347410.5%
 
-40.85951.8%
 
ValueCountFrequency (%) 
-26.93541.1%
 
-29.82140.6%
 
-30.12860.9%
 
-31.46131.9%
 
-331360.4%
 
-33.61400.4%
 
-34.61420.4%
 
-34.82090.6%
 
-36.1408912.4%
 
-36.4621018.8%
 

euribor3m
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 314
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.615653596
Minimum: 0.634
Maximum: 5.045
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 0.634
5-th percentile: 0.797
Q1: 1.344
Median: 4.857
Q3: 4.961
95-th percentile: 4.966
Maximum: 5.045
Range: 4.411
Interquartile range (IQR): 3.617

Descriptive statistics

Standard deviation: 1.73574798
Coefficient of variation (CV): 0.4800647889
Kurtosis: -1.416898308
Mean: 3.615653596
Median Absolute Deviation (MAD): 0.108
Skewness: -0.7021616926
Sum: 119135.786
Variance: 3.012821051
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4.85722716.9%
 
4.96220616.3%
 
4.96319686.0%
 
4.96115124.6%
 
4.8569833.0%
 
4.9649322.8%
 
1.4059242.8%
 
4.8648492.6%
 
4.9658452.6%
 
4.968112.5%
 
4.9687882.4%
 
4.9597172.2%
 
4.867152.2%
 
4.8556682.0%
 
4.0766582.0%
 
1.2666542.0%
 
4.8596291.9%
 
4.125941.8%
 
4.8585811.8%
 
4.1535611.7%
 
4.0215481.7%
 
4.9675181.6%
 
1.2815111.6%
 
4.1914971.5%
 
4.9664891.5%
 
Other values (289)1066632.4%
 
ValueCountFrequency (%) 
0.6346< 0.1%
 
0.635320.1%
 
0.63610< 0.1%
 
0.6376< 0.1%
 
0.6386< 0.1%
 
0.63916< 0.1%
 
0.649< 0.1%
 
0.642320.1%
 
0.643200.1%
 
0.644290.1%
 
ValueCountFrequency (%) 
5.0457< 0.1%
 
57< 0.1%
 
4.971420.4%
 
4.9687882.4%
 
4.9675181.6%
 
4.9664891.5%
 
4.9658452.6%
 
4.9649322.8%
 
4.96319686.0%
 
4.96220616.3%
 

nr.employed
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5166.859608
Minimum: 4963.6
Maximum: 5228.1
Zeros: 0
Zeros (%): 0.0%
Memory size: 514.8 KiB

Quantile statistics

Minimum: 4963.6
5-th percentile: 5017.5
Q1: 5099.1
Median: 5191
Q3: 5228.1
95-th percentile: 5228.1
Maximum: 5228.1
Range: 264.5
Interquartile range (IQR): 129

Descriptive statistics

Standard deviation: 72.20844837
Coefficient of variation (CV): 0.01397530683
Kurtosis: -0.01823278267
Mean: 5166.859608
Median Absolute Deviation (MAD): 37.1
Skewness: -1.03710511
Sum: 170248024.1
Variance: 5214.060016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%) 
5228.11292739.2%
 
5099.1686520.8%
 
5191621018.8%
 
5195.829609.0%
 
5076.213484.1%
 
5017.58542.6%
 
4991.66111.9%
 
5008.75271.6%
 
4963.65041.5%
 
5023.51360.4%
 
5176.38< 0.1%
 
ValueCountFrequency (%) 
4963.65041.5%
 
4991.66111.9%
 
5008.75271.6%
 
5017.58542.6%
 
5023.51360.4%
 
5076.213484.1%
 
5099.1686520.8%
 
5176.38< 0.1%
 
5191621018.8%
 
5195.829609.0%
 
ValueCountFrequency (%) 
5228.11292739.2%
 
5195.829609.0%
 
5191621018.8%
 
5176.38< 0.1%
 
5099.1686520.8%
 
5076.213484.1%
 
5023.51360.4%
 
5017.58542.6%
 
5008.75271.6%
 
4991.66111.9%
 

job_admin.
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  24611  74.7%
1  8339  25.3%

job_blue-collar
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  25594  77.7%
1  7356  22.3%

job_entrepreneur
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  31757  96.4%
1  1193  3.6%

job_housemaid
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  32096  97.4%
1  854  2.6%

job_management
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  30607  92.9%
1  2343  7.1%

job_retired
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  31562  95.8%
1  1388  4.2%

job_self-employed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  31810  96.5%
1  1140  3.5%

job_services
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  29789  90.4%
1  3161  9.6%

job_student
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  32268  97.9%
1  682  2.1%

job_technician
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  27524  83.5%
1  5426  16.5%

job_unemployed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  32157  97.6%
1  793  2.4%

job_unknown
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  32675  99.2%
1  275  0.8%

contact_cellular
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
1  20946  63.6%
0  12004  36.4%

contact_telephone
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  20946  63.6%
1  12004  36.4%

education_basic.4y
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  29617  89.9%
1  3333  10.1%

education_basic.6y
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  31103  94.4%
1  1847  5.6%

education_basic.9y
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  28090  85.3%
1  4860  14.7%

education_high.school
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  25397  77.1%
1  7553  22.9%

education_illiterate
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  32935  > 99.9%
1  15  < 0.1%

education_professional.course
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  28721  87.2%
1  4229  12.8%

education_university.degree
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  23233  70.5%
1  9717  29.5%

education_unknown
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.6 KiB

Value  Count  Frequency (%)
0  31554  95.8%
1  1396  4.2%

y
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 514.8 KiB

Value  Count  Frequency (%)
0  29258  88.8%
1  3692  11.2%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the slope (the angle to the x-axis) affects only the sign of r, not its magnitude.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
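
As a minimal sketch of this calculation in plain Python (the function name `pearson_r` is illustrative, and this is not the implementation the report itself uses), note that the 1/n factors in the covariance and the standard deviations cancel, so sums of deviations are enough:

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of (cross-)deviations; the common 1/n factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))  # exact increasing linear relationship
print(pearson_r(xs, [8, 6, 4, 2]))  # exact decreasing linear relationship
```

Shifting or rescaling either input by a positive factor leaves the result unchanged, which is the invariance property described above.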

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
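
The rank-based calculation can be sketched as follows, again in plain Python with illustrative names (`average_ranks`, `spearman_rho`). Giving tied values the average of their ranks is an assumption here; it is the usual convention but is not stated in the text above.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# A cubic relationship is monotonic but not linear.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```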

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
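
A direct, quadratic-time sketch of the pair counting described above is shown below. It implements the formula exactly as stated (often called tau-a), in which tied pairs count as neither concordant nor discordant. Library implementations commonly default to a tie-corrected variant (tau-b), so results can differ when the data contain ties; the function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted in neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Two concordant pairs and one discordant pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```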

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Sample

First rows

age  marital  default  housing  loan  month  day_of_week  duration  campaign  pdays  previous  poutcome  emp.var.rate  cons.price.idx  cons.conf.idx  euribor3m  nr.employed  job_admin.  job_blue-collar  job_entrepreneur  job_housemaid  job_management  job_retired  job_self-employed  job_services  job_student  job_technician  job_unemployed  job_unknown  contact_cellular  contact_telephone  education_basic.4y  education_basic.6y  education_basic.9y  education_high.school  education_illiterate  education_professional.course  education_university.degree  education_unknown  y
057100151371199910-1.892.893-46.21.2995099.100000000010010000100000
1551010542852999001.193.994-36.44.8605191.000000000000101000000010
23310005552199910-1.892.893-46.21.3135099.101000000000010001000000
3361000653554999001.494.465-41.84.9675228.110000000000001000100000
4271010751892999001.493.918-42.74.9635228.100010000000010000100000
5581011756051999001.493.918-42.74.9625228.100000100000010000001000
6481010532431999001.193.994-36.44.8565191.000000001000001000100000
751001084247999001.493.444-36.14.9625228.110000000000010000000100
8241011631264999001.494.465-41.84.9625228.100100000000001000000100
936001171434999001.493.918-42.74.9625228.100000000010010000001000

Last rows

age  marital  default  housing  loan  month  day_of_week  duration  campaign  pdays  previous  poutcome  emp.var.rate  cons.price.idx  cons.conf.idx  euribor3m  nr.employed  job_admin.  job_blue-collar  job_entrepreneur  job_housemaid  job_management  job_retired  job_self-employed  job_services  job_student  job_technician  job_unemployed  job_unknown  contact_cellular  contact_telephone  education_basic.4y  education_basic.6y  education_basic.9y  education_high.school  education_illiterate  education_professional.course  education_university.degree  education_unknown  y
3294030000082977999001.493.444-36.14.9635228.110000000000010000001000
3294154101085912311-2.992.201-31.40.8495076.210000000000010000000101
3294247101061822999001.494.465-41.84.9615228.100100000000001000000100
32943251010534881999001.193.994-36.44.8585191.001000000000001001000000
32944491010841991999001.493.444-36.14.9635228.101000000000010000100000
32945561001711161999001.493.918-42.74.9605228.100010000000010100000000
3294637100175697999001.493.918-42.74.9575228.100001000000010000000100
3294726000052135499910-1.892.893-46.21.2665099.110000000000010000000100
3294831000041386199900-1.893.075-47.11.4055099.101000000000010001000000
32949391000841791999001.493.444-36.14.9635228.100010000000010100000000

Duplicate rows

Most frequent

age  marital  default  housing  loan  month  day_of_week  duration  campaign  pdays  previous  poutcome  emp.var.rate  cons.price.idx  cons.conf.idx  euribor3m  nr.employed  job_admin.  job_blue-collar  job_entrepreneur  job_housemaid  job_management  job_retired  job_self-employed  job_services  job_student  job_technician  job_unemployed  job_unknown  contact_cellular  contact_telephone  education_basic.4y  education_basic.6y  education_basic.9y  education_high.school  education_illiterate  education_professional.course  education_university.degree  education_unknown  y  count
0270000713312999001.493.918-42.74.9625228.1000000000100100000010002
1320010741281999001.493.918-42.74.9685228.1000000000100100000010002
233101084591999001.493.444-36.14.9685228.1100000000000100000001002
335101055348499900-1.892.893-46.21.3135099.1100000000000100000001002
436100074881999001.493.918-42.74.9665228.1000001000000010000000102
5391000541241999001.193.994-36.44.8555191.0010000000000010100000002
6411010821271999001.493.444-36.14.9665228.1000000000100100000010002
745100074252199900-2.992.469-33.61.0725076.2100000000000100000001012
855100081331999001.493.444-36.14.9655228.1000000010000100001000002
9561000511361999001.193.994-36.44.8575191.0010000000000011000000002